Abstract



We discuss problems posed by the quantitative study of time inho- 
mogeneous Markov chains. The two main notions for our purpose are 
merging and stability. Merging (also called weak ergodicity) occurs when
the chain asymptotically forgets where it started. It is a loss of memory 
property. Stability relates to the question of whether or not, despite tem- 
porary variations, there is a rough shape describing the long time behavior 
of the chain. For instance, we will discuss an example where the long time 
behavior is roughly described by a binomial, with temporal variations. 

1 Introduction 

As is apparent from most text books, the definition of a Markov process includes, 
in the most natural way, processes that are time inhomogeneous. Nevertheless, 
most modern references quickly restrict themselves to the time homogeneous 
case by assuming the existence of a time homogeneous transition function, a 
case for which there is a vast literature. 

The goal of this paper is to point out some interesting problems concern- 
ing the quantitative study of time inhomogeneous Markov processes and, in 
particular, time inhomogeneous Markov chains on finite state spaces. Indeed, 
almost nothing is known about the quantitative behavior of time inhomogeneous 
chains. Even the simplest examples resist analysis. We describe some precise 
questions and examples, and a few results. They indicate the extent of our lack 
of understanding, illustrate the difficulties and, perhaps, point to some hope for 
progress. 

We think the problems discussed below have an intrinsic mathematical in- 
terest (indeed, some of them appear quite hard to solve) and are very natural. 

* Research partially supported by NSF grant DMS 0603886 

† Research partially supported by NSF grants DMS 0603886, DMS 0306194, DMS 0803018
and by an NSF Postdoctoral Fellowship.



Nevertheless, it is reasonable to ask whether or not time inhomogeneous chains 
are relevant in some applications. Most of the recent interest in Markov chains 
is related to Monte Carlo Markov Chain algorithms. In this context, one seeks 
a Markov chain with a given stationary distribution. Hence, time homogeneity 
is rather natural. See, e.g., Still, one of the popular algorithms of this 

sort, the Gibbs sampler, can be viewed as a time inhomogeneous chain (one
that, despite a huge amount of attention, is still resisting analysis). Time
inhomogeneity also appears in the so-called simulated annealing algorithms. See
[T^ for a discussion that is close in spirit to the present work and for older 
references. However, certain special features of each of these two algorithms 
distinguish them from the more basic time inhomogeneous problems we want to 
discuss here. Namely, in the Gibbs sampler, each individual step is not ergodic 
(it involves only one coordinate) whereas, in the simulated annealing context,
the time inhomogeneity vanishes asymptotically. Other interesting stochastic 
algorithms that present time inhomogeneity are discussed in [10].

In many applications of finite Markov chains, the kernel describes transitions 
between different classes in a population of interest. Assuming that these tran- 
sition probabilities can be observed empirically, one application is to compute 
the stationary measure which describes the steady state of the system. Exam- 
ples of this type include models for population migrations between countries, 
models for credit scores used to study the default risk of certain loan portfolios, 
etc. In such examples, it is natural to consider cases when the Markov kernel 
describing the evolution of the system depends on time in either a determin- 
istic or a random manner. The reason for the time inhomogeneity may come, 
for example, from seasonal factors. Or it may model various external events 
that are independent of the state of the system. Even if one decides that time 
homogeneity is warranted, one may wish to study the possible effects of small 
but non-vanishing time dependent perturbations of the model. It seems rather 
important to understand whether or not such perturbations can drastically alter 
the behavior of the underlying model. Practical questions of this type fit nicely
with the theoretical problems discussed below. 

A large class of natural examples of time inhomogeneous chains comes from 
time inhomogeneous random walks on groups. These are discussed in [H[2H]. A
special case is the semi-random transpositions model discussed in [14, 21, 22, 28].

2 Merging and stability 

This section introduces the two main properties we want to focus on: merging 
(in total variation or relative-sup) and stability. Given two Markov kernels
$K_1, K_2$, we set

$$K_1K_2(x,y) = \sum_z K_1(x,z)K_2(z,y).$$

Given a sequence $(K_i)_1^\infty$ and $0 \le m < n$, we set

$$K_{m,n} = K_{m+1}K_{m+2} \cdots K_n.$$

2.1 Merging

Recall that an aperiodic irreducible Markov kernel $K$ on a finite state space
admits a unique invariant probability measure $\pi$. Further, for any starting
measure $\mu_0$ and any large time $n$, the distribution $\mu_n = \mu_0K^n$ at time $n$
is both essentially independent from the starting distribution $\mu_0$ and well
approximated by $\pi$.

Consider now the evolution of a system started according to an initial
distribution $\mu_0$ and driven by a sequence $(K_i)_1^\infty$ of Markov kernels so that,
at time $n$, the distribution is $\mu_n = \mu_0K_1K_2 \cdots K_n$. In [U 1^ such a
sequence $(\mu_n)_1^\infty$ of probability measures is called a "set of absolute
probabilities" but we will not use this terminology here. In many cases, for very
large $n$, the distribution $\mu_n$ will be essentially independent of the initial
distribution $\mu_0$. Namely, if $\mu_0, \mu_0'$ are two initial distributions and
$\mu_n = \mu_0K_1 \cdots K_n$, $\mu_n' = \mu_0'K_1 \cdots K_n$, then it will often be the
case that

$$\lim_{n\to\infty} \|\mu_n - \mu_n'\|_{TV} = 0.$$

We call this loss of memory property merging (total variation merging, to be
more precise).
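The loss of memory property can be observed numerically on a small example. The sketch below is only an illustration under stated assumptions: the two kernels `Q0`, `Q1` on a three-point space are hypothetical (they do not come from the text), the total variation distance is taken to be half the $\ell^1$ distance, and the helper names `step`, `tv` and `tv_along_sequence` are ours.

```python
def step(mu, K):
    """One step of the chain: the row vector mu multiplied by the kernel K."""
    n = len(K)
    return [sum(mu[x] * K[x][y] for x in range(n)) for y in range(n)]


def tv(mu, nu):
    """Total variation distance, taken here as half the l^1 distance."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))


def tv_along_sequence(mu0, nu0, kernels):
    """Return [||mu_n - mu'_n||_TV for n = 0, 1, ..., len(kernels)]."""
    mu, nu = list(mu0), list(nu0)
    out = [tv(mu, nu)]
    for K in kernels:
        mu, nu = step(mu, K), step(nu, K)
        out.append(tv(mu, nu))
    return out


# Hypothetical kernels on a three-point space (an assumption, not from the text).
Q0 = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
Q1 = [[0.2, 0.8, 0.0], [0.4, 0.2, 0.4], [0.0, 0.8, 0.2]]

# The alternating sequence K_i = Q_{i mod 2}, i = 1, ..., 40.
kernels = [Q1 if i % 2 == 1 else Q0 for i in range(1, 41)]

# Two chains started at opposite ends of the state space.
dists = tv_along_sequence([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], kernels)
```

For this particular pair the distances in `dists` decrease to zero, which is what merging in total variation means for this one sequence; it says nothing about other sequences or other kernels.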

One may also want to know whether or not 



$$\lim_{n\to\infty} \sup_x \left| \frac{\mu_n(x)}{\mu_n'(x)} - 1 \right| = 0.$$

We call this latter property relative-sup merging. Total variation merging is often
discussed under the name of "weak ergodicity". See, e.g., pi l4l l6l [TSl [TBI [T8l [24] . 
We think "merging" is more appropriate. 

If there is merging, then one may want to ask quantitative questions about 
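the merging time. For any $\epsilon \in (0,1)$, we set

$$T_1(\epsilon) = \inf\left\{ n : \forall \mu_0, \mu_0',\ \|\mu_n - \mu_n'\|_{TV} < \epsilon \right\} \tag{2.1}$$

and

$$T_\infty(\epsilon) = \inf\left\{ n : \forall \mu_0, \mu_0',\ \sup_x \left| \frac{\mu_n(x)}{\mu_n'(x)} - 1 \right| < \epsilon \right\}. \tag{2.2}$$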



The next definition introduces the collective notions of merging and merging 
time for a given set Q of Markov kernels. 

Definition 2.1. Let $\mathcal{Q}$ be a set of Markov kernels on a finite state space.
We say that $\mathcal{Q}$ is merging in total variation (resp. relative-sup) if any
sequence $(K_i)_1^\infty$ of kernels in $\mathcal{Q}$ is merging in total variation (resp.
relative-sup). We say that $\mathcal{Q}$ has total-variation (resp. relative-sup)
$\epsilon$-merging time at most $T(\epsilon)$ if the total variation (resp. relative-sup)
$\epsilon$-merging time (2.1) (resp. (2.2)) is bounded above by $T(\epsilon)$, for any
sequence $(K_i)_1^\infty$ of kernels in $\mathcal{Q}$.

Let us emphasize that, from the view point of the present work, it is more 
natural to think in terms of properties shared by all sequences drawn from a set 
of kernels than in terms of properties of some particular sequence. 
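For a single, explicitly given sequence of kernels, the two merging times can be computed by brute force. The sketch below rests on an observation that is ours rather than the text's: both suprema over pairs of initial distributions are attained at point masses, so it suffices to compare the rows of $K_1 \cdots K_n$. The kernels `Q0`, `Q1` are hypothetical, total variation is taken as half the $\ell^1$ distance, and all function names are illustrative.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[x][z] * B[z][y] for z in range(n)) for y in range(n)]
            for x in range(n)]


def tv_gap(P):
    """Largest total variation distance between two rows of P."""
    n = len(P)
    return max(0.5 * sum(abs(P[x][y] - P[xp][y]) for y in range(n))
               for x in range(n) for xp in range(n))


def rel_sup_gap(P):
    """Largest value of |P(x,y)/P(x',y) - 1|; infinite if a column has a zero."""
    n = len(P)
    worst = 0.0
    for y in range(n):
        col = [P[x][y] for x in range(n)]
        lo, hi = min(col), max(col)
        if lo <= 0.0:
            return float("inf")
        worst = max(worst, hi / lo - 1.0)
    return worst


def merging_time(kernels, eps, gap):
    """Smallest n <= len(kernels) with gap(K_1 ... K_n) < eps, or None."""
    n = len(kernels[0])
    P = [[1.0 if x == y else 0.0 for y in range(n)] for x in range(n)]
    for t, K in enumerate(kernels, start=1):
        P = matmul(P, K)
        if gap(P) < eps:
            return t
    return None


# Hypothetical kernels (an assumption, not from the text), K_i = Q_{i mod 2}.
Q0 = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
Q1 = [[0.2, 0.8, 0.0], [0.4, 0.2, 0.4], [0.0, 0.8, 0.2]]
kernels = [Q1 if i % 2 == 1 else Q0 for i in range(1, 41)]

T1 = merging_time(kernels, 0.25, tv_gap)          # total variation, as in (2.1)
Tinf = merging_time(kernels, 0.25, rel_sup_gap)   # relative-sup, as in (2.2)
```

This only treats one sequence at a time. The merging time of a set of kernels in the sense of Definition 2.1 is a supremum over all sequences drawn from the set, which this sketch does not attempt to compute.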
2.2 Stability



In the previous section, the notion of merging was introduced as a natural
generalization of the loss of memory property in the time inhomogeneous context.
The notion of stability introduced below is a generalization of the existence of
a positive invariant distribution.

Definition 2.2. Fix $c \ge 1$. Given a Markov chain driven by a sequence of
Markov kernels $(K_i)_1^\infty$, we say that a probability measure $\pi$ is $c$-stable
(for $(K_i)_1^\infty$) if there exists a positive measure $\mu_0$ such that the sequence
$\mu_n = \mu_0K_{0,n}$ satisfies

$$c^{-1}\pi \le \mu_n \le c\pi.$$

When such a measure $\pi$ exists, we say that $(K_i)_1^\infty$ is $c$-stable.

Example 2.3. Let $K$ be an irreducible aperiodic kernel. Then the chain driven
by $K$ is 1-stable. Indeed, it admits a positive invariant measure $\pi$ and
$\pi K^n = \pi$. Further, for any probability measure $\mu_0$ with
$\|(\mu_0/\pi) - 1\|_\infty \le \epsilon$, the sequence $\mu_n = \mu_0K^n$, $n = 1, 2, \dots$,
satisfies $(1-\epsilon)\pi \le \mu_n \le (1+\epsilon)\pi$. Indeed, in the space of signed
measures, the linear map $\mu \mapsto \mu K$ is a contraction for the distance

$$d(\mu,\nu) = \|(\mu/\pi) - (\nu/\pi)\|_\infty.$$

In the next definition, we consider the notion of $c$-stability for a family
$\mathcal{Q}$ of Markov kernels on a fixed state space. This definition is of interest
even in the case when $\mathcal{Q} = \{Q_1, Q_2\}$ is a pair.

Definition 2.4. Fix $c \ge 1$. Given a set $\mathcal{Q}$ of Markov kernels on a fixed
state space, we say that a probability measure $\pi$ is a $c$-stable measure for
$\mathcal{Q}$ if there exists a positive measure $\mu_0$ such that for any choice of
sequence $(K_i)_1^\infty$ in $\mathcal{Q}$, the sequence $\mu_n = \mu_0K_{0,n}$ satisfies

$$c^{-1}\pi \le \mu_n \le c\pi.$$

When such a measure $\pi$ exists, we say that $\mathcal{Q}$ is $c$-stable.

Example 2.5. Assume the state space is a group $G$ and let $\mathcal{Q}$ be the set of
all Markov kernels $Q$ such that $Q(zx, zy) = Q(x, y)$ for all $x, y, z \in G$. This
set is 1-stable with 1-stable measure $u$, the uniform measure on $G$.

Example 2.6. On the two-point space, a finite set $\mathcal{Q}$ of Markov kernels is
$c$-stable if and only if it contains no pairs $\{Q_1, Q_2\}$ with

$$Q_i = \begin{pmatrix} a_i & 1-a_i \\ 1-b_i & b_i \end{pmatrix}$$

such that $Q_1 \ne Q_2$, $a_1 = 0$, $b_2 = 0$. This condition is clearly necessary. It
is not immediately obvious that it is sufficient. See [29].

Remark 2.7. Consider the problem of deciding whether or not a pair
$\mathcal{Q} = \{Q_1, Q_2\}$ of two irreducible ergodic Markov kernels with invariant
measures $\pi_1, \pi_2$, respectively, is $c$-stable. This can be pictured by considering
a rooted infinite binary tree with edges labeled $Q_1$ (= left) and $Q_2$ (= right)
as on Figure 1. Obviously, any sequence $(K_i)_1^\infty$ with $K_i \in \mathcal{Q}$
corresponds uniquely to an end $\omega \in \Omega$ where $\Omega$ denotes the set of the ends
of the tree. Given an initial measure $\mu_0$ (placed at the root), the measure
$\mu_n^\omega = \mu_0K_{0,n}^\omega$ is obtained by following $\omega$ from the root down to
level $n$. Thus, for each choice of $\mu_0$, we obtain a tree with vertices labeled
with measures.



Figure 1: The $Q_1, Q_2$ tree



The question of $c$-stability is the problem of finding an initial measure $\mu_0$
which, in some sense, minimizes the variations among the $\mu_n^\omega$'s. At the
left-most and right-most ends $\omega_1, \omega_2$, we get $\mu_n^{\omega_i} \to \pi_i$. Note
that, if $Q_1, Q_2$ share the same invariant measure $\pi_1 = \pi_2 = \pi$, then the
choice $\mu_0 = \pi$ yields a tree all of whose vertices are labeled by $\pi$. The
existence of a $c$-stable measure $\mu_0$ can be viewed as a weakening of this. The
difficulty is that the existence of an invariant measure and thus the equality
between $\pi_1$ and $\pi_2$ can be viewed as an algebraic property whereas there seem
to be no algebraic tools to study $c$-stability.
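Along a given finite stretch of a sequence, the best constant $c$ in the two-sided bound of Definitions 2.2 and 2.4 can simply be tracked step by step. The sketch below does this and uses it to illustrate Example 2.5 on the cyclic group $\mathbb{Z}_5$. The weight vectors `w0`, `w1` and the function names are hypothetical choices made for the illustration; a finite computation of this kind can suggest, but never prove, $c$-stability.

```python
def step(mu, K):
    """One step of the chain: the row vector mu multiplied by the kernel K."""
    n = len(K)
    return [sum(mu[x] * K[x][y] for x in range(n)) for y in range(n)]


def stability_constant(mu0, pi, kernels):
    """Smallest c >= 1 with pi/c <= mu_n <= c*pi for n = 1, ..., len(kernels)."""
    c = 1.0
    mu = list(mu0)
    for K in kernels:
        mu = step(mu, K)
        for m, p in zip(mu, pi):
            if m <= 0.0:
                return float("inf")
            c = max(c, m / p, p / m)
    return c


def convolution_kernel(weights):
    """Invariant kernel Q(x, y) = weights[(y - x) mod n] on the cyclic group Z_n."""
    n = len(weights)
    return [[weights[(y - x) % n] for y in range(n)] for x in range(n)]


# Two hypothetical step distributions on Z_5 (assumptions for the example).
w0 = [0.5, 0.5, 0.0, 0.0, 0.0]
w1 = [0.2, 0.0, 0.3, 0.0, 0.5]
Q0, Q1 = convolution_kernel(w0), convolution_kernel(w1)
kernels = [Q1 if i % 2 == 1 else Q0 for i in range(1, 31)]

uniform = [0.2] * 5
c_uniform = stability_constant(uniform, uniform, kernels)
c_skewed = stability_constant([0.6, 0.1, 0.1, 0.1, 0.1], uniform, kernels)
```

Starting from the uniform measure, `c_uniform` stays equal to 1 up to rounding, in line with Example 2.5. Starting from the skewed measure, the constant is strictly larger than 1, which shows that the choice of $\mu_0$ matters.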

2.3 Simple results and examples 

We are interested in finding conditions on the individual kernels $K_i$ of a
sequence $(K_n)_1^\infty$ that imply merging. This is not obvious even if we consider
the very special case when all the $K_i$'s are drawn from a finite set of kernels
$\mathcal{Q} = \{Q_0, \dots, Q_m\}$ or even from a pair $\mathcal{Q} = \{Q_0, Q_1\}$.

• Suppose that $Q_0, Q_1$ are irreducible and aperiodic. Does it imply that any
sequence $(K_i)_1^\infty$ drawn from $\mathcal{Q} = \{Q_0, Q_1\}$ is merging?

The answer is no. Let $\pi_0$ be the invariant measure of $Q_0$ and let
$Q_1 = Q_0^*$ be the adjoint of $Q_0$ on $\ell^2(\pi_0)$. If $(Q_0, \pi_0)$ is not
reversible (i.e., $Q_0$ is not self-adjoint on $\ell^2(\pi_0)$) then it is possible
that $Q_0Q_0^*$ is not irreducible. When $Q_0Q_0^*$ is not irreducible, the sequence
$K_i = Q_{i \bmod 2}$ is not merging.

• Suppose that $Q_0, Q_1$ are reversible, irreducible and aperiodic. Does it imply
that any sequence $(K_i)_1^\infty$ drawn from $\mathcal{Q} = \{Q_0, Q_1\}$ is merging in
relative-sup?

The answer is no, even on the two point space! On the two point space,
$\mathcal{Q} = \{Q_0, Q_1\}$ is merging in total variation as long as $Q_0, Q_1$ are
irreducible aperiodic but relative sup merging fails for the irreducible aperiodic
pairs of the type

with $0 < a, b < 1$. See

The following examples are instructive. 

Example 2.8. On $S = \{1, \dots, 5\}$ consider the reversible kernels $Q_0, Q_1$
corresponding to the graphs in Figure 2 (all edges have weight 1). Consider the
sequence $K_i = Q_{i \bmod 2}$ so that $K_1 = Q_1$, $K_2 = Q_0$, $K_3 = Q_1$, .... If, at
an even time $n = 2\ell$, the chain is at states 2 or 5 then from that time on, the
chain will be in $\{2, 5\}$ at even times and in $\{3, 4\}$ at odd times. In this
example, the chain driven by $(K_i)_1^\infty$ is merging in total variation but is not
merging in relative-sup.



Figure 2: A five-point example

Figure 3: A seven-point example

Example 2.9. The kernels depicted in Figure 3 yield an example where total
variation (hence, a fortiori, relative-sup) merging fails. In this example, the
sequence $(K_i)_1^\infty$ with $K_i = Q_{i \bmod 2}$ fails to be merging in total
variation because the chain will eventually end up oscillating either between 2
and 1, or between $\{4, 7\}$ and $\{5, 6\}$, with a preference for one or the other
depending on the starting distribution $\mu_0$.

Let us give two simple results concerning merging. 

Proposition 2.10. Assume that, for each $i$, there exists a state $y_i$ and a real
$\epsilon_i \in (0, 1)$ such that

$$\forall x,\quad K_i(x, y_i) \ge \epsilon_i.$$

If $\sum_i \epsilon_i = \infty$ then the sequence $(K_i)_1^\infty$ is merging in total
variation. If, in addition, each $K_i$ is irreducible then the sequence
$(K_i)_1^\infty$ is also merging in relative-sup.

Proof. For total variation, this can be proved by Doeblin's well-known coupling
argument (see, e.g., [13, 22]) and irreducibility of the kernels is not needed. Of
course, the mass might ultimately concentrate on a fraction of the state space.

Merging in relative-sup is a bit more subtle and irreducibility is needed for 
that conclusion to hold (even in the time homogeneous case). A proof using 



Remark 2.11. Under the much stronger hypothesis $\forall i, x, y$,
$K_i(x, y) \ge \epsilon > 0$, one gets an immediate control of any sequence
$\mu_n = \mu_0K_{0,n}$, $n = 1, 2, \dots$, in the form

$$\forall z,\quad \epsilon \le \min_{x,y}\{K_n(x, y)\} \le \mu_n(z) \le \max_{x,y}\{K_n(x, y)\} \le 1 - (N-1)\epsilon,$$

where $N$ is the size of the state space.

Remark 2.12. The hypothesis $\exists y_i, \forall x$, $K_i(x, y_i) \ge \epsilon_i > 0$, is
obviously too strong in many cases but it can often be applied to study a time
inhomogeneous chain $(K_i)_1^\infty$ by grouping terms and considering the sequence
$Q_i = K_{n_i, n_{i+1}}$ for an appropriately chosen increasing sequence $n_i$. In the
simplest case, for a given sequence $(K_i)_1^\infty$, one seeks $\epsilon \in (0, 1)$ and an
integer $m$ such that $K_{im, im+m}(x, y) \ge \epsilon$ for all $x, y, i$. When such a lower
bound holds, one concludes that (1) the chain is merging in total variation and
relative-sup and (2) there exists $c \in (0, 1)$ such that for any starting measure
$\mu_0$ and $n$ large enough, the measures $\mu_n = \mu_0K_{0,n}$ satisfy
$c \le \mu_n(z) \le 1 - c$. However, this type of argument is bound to yield very poor
quantitative results in most cases.

For the next result, recall that an adjacency matrix $A$ is a matrix whose
entries are either 0 or 1.

Proposition 2.13. On a finite state space let $(K_i)_1^\infty$ be a sequence of Markov
kernels. Assume that:

1. (Uniform irreducibility) There exist an integer $\ell$, an $\epsilon \in (0, 1)$ and
adjacency matrices $(A_i)_1^\infty$ such that, $\forall i, x, y$, $A_i^\ell(x, y) > 0$ and
$K_i(x, y) \ge \epsilon A_i(x, y)$.

2. (Uniform laziness) There exists $\eta \in (0, 1)$ such that, $\forall i, x$,
$K_i(x, x) \ge \eta$.

Then the chain driven by $(K_i)_1^\infty$ is merging in total variation and
relative-sup norm. Moreover, there exist $n_0$ and $c \in (0, 1)$ such that for any
starting distribution $\mu_0$, all $n \ge n_0$ and all $z$, $\mu_n = \mu_0K_{0,n}$ satisfies
$\mu_n(z) \in (c, 1 - c)$.

Proof. Let $N$ be the size of the state space. Using (1)-(2), one can show (see [55])
that $K_{n,n+N}(x, y) \ge (\min\{\epsilon, \eta\})^{N-1}$. The desired result follows from
Proposition 2.10 and Remark 2.12. □

Note that this argument can only give very poor quantitative results!
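The hypothesis of Proposition 2.10 is easy to test on explicit kernels, and it yields a crude quantitative bound. The sketch below uses the standard contraction form of the Doeblin argument, namely that the largest total variation distance between two rows of $K_1 \cdots K_n$ is at most $\prod_{i \le n}(1 - \epsilon_i)$. This product bound is a textbook fact that we bring in for illustration; it is not a statement quoted from the text. The kernels and function names are hypothetical, and total variation is half the $\ell^1$ distance.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[x][z] * B[z][y] for z in range(n)) for y in range(n)]
            for x in range(n)]


def doeblin_eps(K):
    """Best eps such that some column y has K(x, y) >= eps for every x."""
    n = len(K)
    return max(min(K[x][y] for x in range(n)) for y in range(n))


def tv_gap(P):
    """Largest total variation distance between two rows of P."""
    n = len(P)
    return max(0.5 * sum(abs(P[x][y] - P[xp][y]) for y in range(n))
               for x in range(n) for xp in range(n))


def compare_with_doeblin(kernels):
    """For n = 1, 2, ..., pair tv_gap(K_1...K_n) with prod_{i<=n} (1 - eps_i)."""
    n = len(kernels[0])
    P = [[1.0 if x == y else 0.0 for y in range(n)] for x in range(n)]
    bound = 1.0
    rows = []
    for K in kernels:
        P = matmul(P, K)
        bound *= 1.0 - doeblin_eps(K)
        rows.append((tv_gap(P), bound))
    return rows


# Hypothetical kernels (an assumption, not from the text), K_i = Q_{i mod 2}.
Q0 = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
Q1 = [[0.2, 0.8, 0.0], [0.4, 0.2, 0.4], [0.0, 0.8, 0.2]]
kernels = [Q1 if i % 2 == 1 else Q0 for i in range(1, 21)]
rows = compare_with_doeblin(kernels)
```

In this example the true gap is much smaller than the Doeblin product at every step, a small-scale instance of the remark that arguments of this type tend to give poor quantitative results.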
3 A short review of the literature



The largest body of literature concerning time inhomogeneous Markov processes
comes, perhaps, from the analysis of Partial Differential Equations where time
dependent coefficients are allowed. The book [36] can serve as a basic reference.
Unfortunately, it seems that the results developed in that context are local in
nature and are not very relevant to the quantitative problems we are interested
in. The literature on (finite) time inhomogeneous Markov chains can be organized
under three basic headings: weak ergodicity, asymptotic structure, and
products of stochastic matrices. We now briefly review each of these directions.

3.1 Weak ergodicity 

One of the earliest references concerning the asymptotic behavior of time inho- 
mogeneous chains is a note of Emile Borel [2] where he discusses time inhomo- 
geneous card shufflings. In the context of general time inhomogeneous chains 
on finite state spaces, weak ergodicity, which we call total variation merging, 
i.e., the tendency to forget the distant past, was introduced in [19] and is the 
main subject of [T6| . See also and the reference to the work of Doeblin given 
there. A sample of additional old and not so old references in this direction is 
[151 [m [21 [Ml US 132]. An historical review is given in [33]. The main tools 
developed in these references to prove weak ergodicity are the use of ergodic 
coefficients and couplings. A modern perspective, close in spirit to our interests,
is in [lOl [Til IB]. It may be worth pointing out that, by design, ergodic
coefficients mostly capture some asymptotic properties and are not well suited 
for quantitative results, even in the time homogeneous case. 

3.2 Asymptotic structure 

One of the basic results in the theory of time homogeneous finite Markov chains 
describes the decomposition of the state space into non-essential (or transient) 
states, essential classes and periodic subclasses. It turns out that, perhaps 
surprisingly, there exists a completely general version of this result for time 
inhomogeneous chains. This result is rather more subtle than its time homogeneous
counterpart. Sonin [33, Theorem 1] calls it the Decomposition-Separation
Theorem and reviews its history which starts with a paper of Kolmogorov [19],
with further important contributions by Blackwell [1], Cohn [4] and Sonin [34] . 

Fix a sequence $(K_n)_1^\infty$ of Markov kernels on a finite state space $\Omega$. The
Decomposition-Separation Theorem yields a sequence
$(\{S_n^k, k = 0, \dots, c\})_{n=1}^\infty$ of partitions of $\Omega$ so that:

(a) With probability one, the trajectories of any Markov chain $(X_n)$ driven by
$(K_n)_1^\infty$ will, after a finite number of steps, enter one of the sequences
$S^k = (S_n^k)_{n=1}^\infty$, $k = 1, \dots, c$, and stay there forever. Further, for
each $k$,

$$\sum_{n=1}^\infty \left[ P(X_n \in S_n^k;\ X_{n+1} \notin S_{n+1}^k) + P(X_n \notin S_n^k;\ X_{n+1} \in S_{n+1}^k) \right] < \infty.$$

(b) For each $k = 1, \dots, c$, and for any two Markov chains $(X_n^1)_1^\infty$,
$(X_n^2)_1^\infty$ driven by $(K_n)_1^\infty$ such that
$\lim_{n\to\infty} P(X_n^i \in S_n^k) > 0$, and any sequence of states $x_n \in S_n^k$,

The sequence $(S_n^0)_1^\infty$ describes "non-essential states" and a chain is weakly
ergodic (i.e., merging in total variation) if and only if $c = 1$, i.e., there is
only one essential class. We refer the reader to [34] for a detailed discussion
and connections with other problems.

The Decomposition-Separation Theorem can be illustrated (albeit, in a rather
trivial way) using Example 2.9 of Figure 3 above. In this case,
$\Omega = \{1, \dots, 7\}$. We consider the sequence of partitions $(S_n^k)$,
$k \in \{0, 1, 2\}$, where $S_{2n}^0 = \{1, 3, 5, 6\}$, $S_{2n+1}^0 = \{2, 3, 4, 7\}$,
$S_{2n}^1 = \{2\}$, $S_{2n+1}^1 = \{1\}$ and $S_{2n}^2 = \{4, 7\}$, $S_{2n+1}^2 = \{5, 6\}$.

Any chain driven by $Q_1, Q_0, Q_1, \dots$ will eventually end up staying either in
$S^1$ or in $S^2$ forever.

The Decomposition-Separation Theorem is a very general result which holds 
without any hypothesis on the kernels Kn. We are instead interested in finding 
hypotheses, perhaps very restrictive ones, on the individual kernels Kn that 
translate into strong quantitative results concerning the merging property of 
the chain. 

3.3 Products of stochastic matrices 

There is a rather rich literature on the study of products of stochastic matrices.
Recall that stochastic matrices are matrices with non-negative entries and row
sums equal to 1. This last assumption, which breaks the row/column symmetry,
implies that there are significant differences between forward and backward
products of stochastic matrices. Given a sequence $(K_i)_1^\infty$ of stochastic
matrices, the forward products form the sequence

$$K^f_{0,n} = K_1K_2 \cdots K_n, \quad n = 1, 2, \dots,$$

whereas the backward products form the sequence

$$K^b_{0,n} = K_n \cdots K_2K_1, \quad n = 1, 2, \dots.$$

There is a crucial difference between these two sequences: The entries
$K^f_{0,n}(x, y)$ do not have any general monotonicity properties but, for any $y$,

$$n \mapsto M(n, y) = \max_x \{K^b_{0,n}(x, y)\}$$

is monotone non-increasing and

$$n \mapsto m(n, y) = \min_x \{K^b_{0,n}(x, y)\}$$

is monotone non-decreasing. These properties are obvious consequences of the
fact that the matrices $K_i$ are stochastic matrices. Of course,
$\lim_{n\to\infty} M(n, y)$ and $\lim_{n\to\infty} m(n, y)$ exist for all $y$.
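The monotonicity of the column maxima and minima of backward products can be checked directly, since each row of $K_{n+1}(K_n \cdots K_1)$ is a convex combination of the rows of $K_n \cdots K_1$. The sketch below is a minimal numerical check on hypothetical kernels; the function names are ours and nothing here is specific to the kernels chosen.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[x][z] * B[z][y] for z in range(n)) for y in range(n)]
            for x in range(n)]


def backward_extremes(kernels):
    """For n = 1, 2, ..., return (M(n, .), m(n, .)) for the product K_n ... K_1."""
    B = None
    out = []
    for K in kernels:
        # Backward product: the new kernel multiplies on the left.
        B = K if B is None else matmul(K, B)
        size = len(B)
        M = [max(B[x][y] for x in range(size)) for y in range(size)]
        m = [min(B[x][y] for x in range(size)) for y in range(size)]
        out.append((M, m))
    return out


# Hypothetical kernels (an assumption, not from the text), K_i = Q_{i mod 2}.
Q0 = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
Q1 = [[0.2, 0.8, 0.0], [0.4, 0.2, 0.4], [0.0, 0.8, 0.2]]
kernels = [Q1 if i % 2 == 1 else Q0 for i in range(1, 13)]

extremes = backward_extremes(kernels)
```

Replacing `matmul(K, B)` by `matmul(B, K)` gives the forward products, for which no such monotonicity is guaranteed in general.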
If, for some reason, we know that

$$\lim_{n\to\infty} \sum_y \left[ M(n, y) - m(n, y) \right] = 0$$

then it follows that the backward products converge to a row-constant matrix
$\Pi$, i.e.,

$$\forall x, x', y,\quad \Pi(x, y) = \lim_{n\to\infty} K^b_{0,n}(x, y),\quad \Pi(x, y) = \Pi(x', y).$$

The references [TBI UHl 1221 1151 [SB [3S] form a sample of old and recent works
dealing with this observation.

Changing viewpoint and notation somewhat, consider all finite products
of matrices drawn from a set $\mathcal{Q}$ of $N \times N$ stochastic matrices. For
$\omega = (\dots, K_{i-1}, K_i, K_{i+1}, \dots) \in \mathcal{Q}^{\mathbb{Z}}$ a doubly infinite
sequence of matrices and $m \le n \in \mathbb{Z}$, set

$$K^\omega_{m,n} = K_{m+1} \cdots K_n.$$

A stochastic matrix is called (SIA) if its powers converge to a constant row
matrix. Here, (SIA) stands for stochastic, irreducible and aperiodic although
"irreducible" really means that the matrix has a unique recurrent class (transient
states are allowed so that the constant row limit matrix may have some zero
columns). A central result in this area (e.g., [35J I3H]) is that, if $\mathcal{Q}$ is
finite and all finite products of matrices in $\mathcal{Q}$ are (SIA) then, for any
doubly infinite sequence $\omega \in \mathcal{Q}^{\mathbb{Z}}$,

$$\lim \sum_y \left| K^\omega_{m,n}(x, y) - K^\omega_{m,n}(x', y) \right| = 0 \tag{3.3}$$

and

$$\lim_{m\to-\infty} K^\omega_{m,n} = \Pi^\omega_n \tag{3.4}$$

where $\Pi^\omega_n$ is a row-constant matrix. Let $\pi^\omega_n$ be the probability measure
corresponding to the rows of the row-constant matrix $\Pi^\omega_n$. Observe that
(3.3)-(3.4) imply

$$\lim \sum_y \left| K^\omega_{m,n}(x, y) - \pi^\omega_n(y) \right| = 0.$$

The following proposition establishes some relations between these consid- 
erations, total variation merging and stability. 

Proposition 3.1. Let $\mathcal{Q}$ be a set of $N \times N$ stochastic matrices. Assume
that $\mathcal{Q}$ is merging (in total variation) and $c$-stable w.r.t. a positive
measure $\pi$. Then

1. Any finite product $P$ of matrices in $\mathcal{Q}$ is irreducible aperiodic and its
unique positive invariant measure $\pi_P$ satisfies $c^{-1}\pi \le \pi_P \le c\pi$.

2. For any $\omega \in \mathcal{Q}^{\mathbb{Z}}$ and any $n \in \mathbb{Z}$, $\pi^\omega_n$ satisfies
$c^{-1}\pi \le \pi^\omega_n \le c\pi$, i.e., any limit row $\pi'$ of backward products of
matrices in $\mathcal{Q}$ satisfies $c^{-1}\pi \le \pi' \le c\pi$.
Proof. (1) As $\mathcal{Q}$ is $c$-stable w.r.t. $\pi$, there exists a positive measure
$\mu_0$ such that for any finite product $P$ of matrices in $\mathcal{Q}$ and any $n$,
$c^{-1}\pi \le \mu_0P^n \le c\pi$. Since $\mathcal{Q}$ is merging, we must have
$\lim_{n\to\infty} P^n = \Pi_P$ with $\Pi_P$ having constant rows, call them $\pi_P$. This
implies $c^{-1}\pi \le \pi_P \le c\pi$. Since $\pi$ is positive, $\pi_P$ must be positive
and $\lim_{n\to\infty} P^n = \Pi_P$ implies that $P$ is irreducible aperiodic. We note
that (1) is, in fact, a sufficient condition for stability. See [21, Prop. 4.9].
Under the hypothesis that $\mathcal{Q}$ is merging, (1) is thus a necessary and
sufficient condition for $c$-stability.

(2) Fix $\omega \in \mathcal{Q}^{\mathbb{Z}}$. By hypothesis, on the one hand, there exists a
positive probability measure $\mu_0$ such that
$c^{-1}\pi \le \mu_0K^\omega_{m,n} \le c\pi$. On the other hand, merging implies that
$\lim_{m\to-\infty} K^\omega_{m,n} = \Pi^\omega_n$ and thus,
$\lim_{m\to-\infty} \mu_0K^\omega_{m,n} = \pi^\omega_n$. The desired result follows. □

3.4 Product of random stochastic matrices 

For pointers to the literature on products of random stochastic matrices and
Markov chains in a random environment, see, e.g., [51 [51 ?I7[ 137] and the
references therein. We end this section with short comments regarding the simplest
case of products of random stochastic matrices, i.e., the case where the matrices
$K_i$ form an i.i.d. sequence of stochastic matrices. The backward and forward
products $K^b_{0,n} = K_n \cdots K_1$, $K^f_{0,n} = K_1 \cdots K_n$ become random
variables taking values in the set of all $N \times N$ stochastic matrices. Although
these two sequences of random variables have very different behavior as $n$ varies,
$K^b_{0,n}$ and $K^f_{0,n}$ have the same law. Takahashi [37] proves that if

$$\forall x, x',\quad \lim_{n\to\infty} \sum_y \left| K^b_{0,n}(x, y) - K^b_{0,n}(x', y) \right| = 0 \quad \text{almost surely}$$

then $K^f_{0,n}$ converges in law and the limit law is that of the limit random
variable $\lim_{n\to\infty} K^b_{0,n}$. Rosenblatt [27] applies the theory of random
walks on semigroups to show that the Cesàro sums
$n^{-1}\sum_{j=1}^n K^f_{0,j}(x, y)$ always converge to a constant almost surely. The
articles [3, 6] discuss similar results under more general hypotheses on the
nature of the random sequence $(K_i)_1^\infty$. Unfortunately, these interesting results
concerning random environments do not shed much light on the quantitative
questions emphasized here.

4 Quantitative results and examples 

Informally, the question we want to focus on is the following. Let $(K, \pi)$ be an
irreducible aperiodic Markov kernel and its stationary probability measure. Let
$(K_i)_1^\infty$ be a sequence of Markov kernels so that, for each $i$, $K_i$ is a
perturbation of $K$ with invariant measure $\pi_i$ that is a perturbation of $\pi$
(what "perturbation" means here is left open on purpose). For an initial
distribution $\mu_0$, consider the associated sequence of measures defined by
$\mu_n = \mu_0K_1 \cdots K_n$, $n = 1, 2, \dots$.

Problem 4.1. (1) Does total variation merging hold?

(2) Does relative-sup merging hold?

(3) Does there exist $c \ge 1$ such that, for $n$ large enough,

$$\forall x,\quad c^{-1} \le \frac{\mu_n(x)}{\pi(x)} \le c\ ?$$

Obviously, these questions call for quantitative results describing the merging
times, the constant $c$ and the "large" time $n$ in terms of bounds on the allowed
perturbations.

To understand what is meant by quantitative results, it is easier to consider
a family of problems depending on a parameter representing the size and
complexity of the problem. So, one starts with a family $(\Omega_N, K_N, \pi_N)$ of
ergodic Markov kernels depending on the parameter $N$ whose mixing time sequence
$(T_1(N, \epsilon))_1^\infty$ (say, in total variation) is understood. Then, for each $N$,
we consider perturbations $(K_{N,i})_{i=1}^\infty$ of $K_N$ with stationary measure
$\pi_{N,i}$ close to $\pi_N$ and ask if the merging time of $(K_{N,i})_{i=1}^\infty$ can be
controlled in terms of $T_1(N, \epsilon)$.

Problem 4.2. Let $\Omega_N = \{0, \dots, N\}$. Let $\mathcal{Q}_N$ be the set of all birth and
death chains $Q$ on $\Omega_N$ with $Q(x, x + \epsilon) \in [1/4, 3/4]$ for all
$x, x + \epsilon \in \Omega_N$, $\epsilon \in \{-1, 0, 1\}$ and with reversible measure $\pi$
satisfying $1/4 \le (N+1)\pi(x) \le 4$, $x \in \Omega_N$.

1. Prove or disprove that there exists a constant $A$ independent of $N$ such
that $\mathcal{Q}_N$ has total variation $\epsilon$-merging time at most
$AN^2(1 + \log_+ 1/\epsilon)$.

2. Prove or disprove that there exists a constant $A$ independent of $N$ such
that $\mathcal{Q}_N$ has relative-sup $\epsilon$-merging time at most
$AN^2(1 + \log_+ 1/\epsilon)$.

3. Prove or disprove that there exist constants $A, C \ge 1$ such that, for any
$N$ and any sequence $(K_i)_1^\infty \in \mathcal{Q}_N$, we have

$$\forall x, y \in \Omega_N,\ \forall n \ge AN^2,\quad \frac{1}{C(N+1)} \le K_{0,n}(x, y) \le \frac{C}{N+1}.$$

Here the time homogeneous model is the birth and death chain $K_N$ with
constant rates $p = q = r = 1/3$ and $\pi_N \equiv 1/(N+1)$, so that $K_N(x, y) = 0$
unless $|x - y| \le 1$, $K(0, 0) = K(N, N) = 2/3$ and $K(x, x) = K(x, x \pm 1) = 1/3$
otherwise. Of course, it is well known that
$T_1(K_N, \epsilon) \simeq T_\infty(K_N, \epsilon) \simeq N^2(1 + \log_+(1/\epsilon))$ for small
$\epsilon > 0$. Problem 4.2 asks whether or not these mixing/merging times are stable
under suitable time inhomogeneous perturbations of $K_N$ and whether or not the
limiting behavior stays comparable to that of the model chain. To the best of our
knowledge the answer is not known and this innocent looking problem should be
taken seriously.
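To make the class of Problem 4.2 concrete, the following sketch builds constant rate birth and death chains and tests the two membership conditions: every allowed transition probability lies in $[1/4, 3/4]$, and the reversible measure satisfies $1/4 \le (N+1)\pi(x) \le 4$. The function names are ours, and the second chain (up-rate 0.4, down-rate 0.3) is a hypothetical example chosen to show how a constant drift violates the condition on $\pi$ once $N$ is moderately large.

```python
def birth_death(N, p, q):
    """Constant rate chain on {0, ..., N}: up with prob. p, down with prob. q."""
    K = [[0.0] * (N + 1) for _ in range(N + 1)]
    for x in range(N + 1):
        if x < N:
            K[x][x + 1] = p
        if x > 0:
            K[x][x - 1] = q
        # Holding probability takes up the remaining mass (2/3 at the ends
        # of the model chain with p = q = 1/3).
        K[x][x] = 1.0 - sum(K[x])
    return K


def reversible_measure(K):
    """Reversible probability measure of a birth and death chain, obtained
    from the detailed balance relation pi(x) K(x, x+1) = pi(x+1) K(x+1, x)."""
    N = len(K) - 1
    w = [1.0]
    for x in range(N):
        w.append(w[-1] * K[x][x + 1] / K[x + 1][x])
    total = sum(w)
    return [v / total for v in w]


def in_Q_N(K):
    """Membership test for the class of Problem 4.2."""
    N = len(K) - 1
    for x in range(N + 1):
        for y in range(N + 1):
            if abs(x - y) > 1:
                if K[x][y] != 0.0:
                    return False
            elif not (0.25 <= K[x][y] <= 0.75):
                return False
    pi = reversible_measure(K)
    return all(0.25 <= (N + 1) * v <= 4.0 for v in pi)


model = birth_death(10, 1.0 / 3.0, 1.0 / 3.0)   # the model chain K_N, N = 10
drift = birth_death(10, 0.4, 0.3)               # hypothetical chain with drift
```

The model chain passes the test, while the chain with drift fails it for $N = 10$ because its reversible measure is too far from uniform. This is one way to see that the class $\mathcal{Q}_N$ only allows perturbations that keep the invariant measure comparable to the uniform one.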

There appears to be only a small number of papers that attempt to prove 
quantitative results for time inhomogeneous chains. These include [11] [T31 [HI 
[SB [12] and the authors' works [Ml [HI 1301 IS] • The works [H [HI [21 [H] treat 
only examples of time inhomogeneous chains that admit an invariant measure. 
Technically, this is a very specific hypothesis and, indeed, these works show 
that many of the well developed techniques that have been used to study time 
homogeneous chains can be successfully applied under this hypothesis. 
4.1 Singular values

A typical qualitative result about finite Markov chains is that an irreducible
aperiodic chain is ergodic. We do not know of any quantitative versions of this
statement. Let $K$ be an irreducible aperiodic Markov kernel with stationary
measure $\pi$ so that $\mu_n = \mu_0K^n \to \pi$ as $n$ tends to infinity, for any starting
distribution $\mu_0$.

If $(K, \pi)$ is reversible (i.e., $\pi(x)K(x, y) = \pi(y)K(y, x)$) and if $\beta$ denotes
the second largest absolute value of the eigenvalues of $K$ acting on $\ell^2(\pi)$
then $\beta < 1$ and

$$2\|\mu_n - \pi\|_{TV} \le \|\mu_0/\pi\|_2\, \beta^n \tag{4.5}$$

where $\|\mu_0/\pi\|_2$ is the norm of $f_0 = \mu_0/\pi$ in $\ell^2(\pi)$. This can be
considered as a quantitative result although it involves the perhaps unknown
reversible measure $\pi$.

If $(K, \pi)$ is not reversible, the inequality still holds with $\beta$ being the
second largest singular value of $K$ on $\ell^2(\pi)$ (i.e., the square root of the
second largest eigenvalue of $KK^*$ where $K^*$ is the adjoint of $K$ on
$\ell^2(\pi)$). However, it is then possible that $\beta = 1$, in which case the
inequality fails to capture the qualitative ergodicity of the chain.

Inequality (4.5) has an elegant generalization to the time inhomogeneous
setting. Let $(K_i)_1^\infty$ be a sequence of irreducible Markov kernels (on a finite
state space). Fix a positive probability measure $\mu_0$ (by positive we mean here
that $\mu_0(x) > 0$ for all $x$) and set

$$\mu_n = \mu_0K_{0,n}.$$

In the time inhomogeneous setting, we want to compare this sequence of
measures $(\mu_n)_1^\infty$ to the sequence of measures $(K_{0,n}(x, \cdot))_1^\infty$ describing
the distribution at time $n$ of the chain started at an arbitrary point $x$.

To state the result, for each $i$, consider $K_i$ as a linear operator acting from
$\ell^2(\mu_i)$ to $\ell^2(\mu_{i-1})$. One easily checks that this operator is a
contraction. Its singular values are the square roots of the eigenvalues of the
operator $P_i = K_i^*K_i : \ell^2(\mu_i) \to \ell^2(\mu_i)$ where
$K_i^* : \ell^2(\mu_{i-1}) \to \ell^2(\mu_i)$ is the adjoint operator which is a Markov
operator with kernel

$$K_i^*(x, y) = \frac{K_i(y, x)\mu_{i-1}(y)}{\mu_i(x)}.$$

We let $\sigma_i$ be the second largest singular value of
$K_i : \ell^2(\mu_i) \to \ell^2(\mu_{i-1})$. It is the square root of the second largest
eigenvalue of the Markov kernel

$$P_i(x, y) = \frac{1}{\mu_i(x)} \sum_z K_i(z, x)K_i(z, y)\mu_{i-1}(z). \tag{4.6}$$
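Formula (4.6) translates directly into a computation. The sketch below builds $P_i$ from $K_i$ and $\mu_{i-1}$ and estimates its second largest eigenvalue by power iteration on functions with $\mu_i$-mean zero, using the fact that $P_i$ is self-adjoint and non-negative on $\ell^2(\mu_i)$. The choice of power iteration, the starting function and all names are our assumptions. The iteration presumes that the starting function is not orthogonal to the relevant eigenspace and that $\mu_i$ is positive, so the result should be read as an estimate.

```python
import math


def step(mu, K):
    """One step of the chain: the row vector mu multiplied by the kernel K."""
    n = len(K)
    return [sum(mu[x] * K[x][y] for x in range(n)) for y in range(n)]


def p_kernel(K, mu_prev):
    """Return (P, mu) where mu = mu_prev K and P is the kernel in (4.6)."""
    n = len(K)
    mu = step(mu_prev, K)
    P = [[sum(K[z][x] * K[z][y] * mu_prev[z] for z in range(n)) / mu[x]
          for y in range(n)] for x in range(n)]
    return P, mu


def second_singular_value(K, mu_prev, iters=500):
    """Estimate the second singular value of K : l^2(mu) -> l^2(mu_prev)."""
    P, mu = p_kernel(K, mu_prev)
    n = len(K)
    f = [float(x) for x in range(n)]      # arbitrary non-constant start
    lam = 0.0
    for _ in range(iters):
        mean = sum(m * v for m, v in zip(mu, f))
        f = [v - mean for v in f]         # project onto mean-zero functions
        norm = math.sqrt(sum(m * v * v for m, v in zip(mu, f)))
        if norm == 0.0:
            break
        f = [v / norm for v in f]
        g = [sum(P[x][y] * f[y] for y in range(n)) for x in range(n)]
        lam = sum(m * a * b for m, a, b in zip(mu, f, g))  # Rayleigh quotient
        f = g
    return math.sqrt(max(lam, 0.0))


# Two small hypothetical examples (assumptions, not from the text).
K2 = [[0.9, 0.1], [0.3, 0.7]]
sigma2 = second_singular_value(K2, [0.5, 0.5])

K3 = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]]
sigma3 = second_singular_value(K3, [1.0 / 3.0] * 3)
```

In the symmetric three-point example the uniform measure is preserved, $P_i = K^2$, and the estimate agrees with the second eigenvalue $1/4$ of $K$. In the two-point example $P_i$ has eigenvalues 1 and $\operatorname{trace}(P_i) - 1$, which gives an independent check.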



Theorem 4.3. With the notation introduced above, we have 

71 

1 

and 

n 

1 

For the proof, see [11, 29]. The proofs given in [11] and [29] are rather different
in spirit, with [11] avoiding the explicit use of singular values. Introducing
singular values allows for further refinements and is useful for practical
estimates. See [IHlll^. When coupled with the hypothesis of $c$-stability, the above
result becomes a powerful and very applicable tool. See, e.g., [3S1 Theorem 4.11]
and the examples treated in [351 130]. Unfortunately, proving $c$-stability is not
an easy task.

A good example of application of Theorem 4.3 is the following result taken
from [29]. We refer the reader to [29] for the proof.

Theorem 4.4. Fix $1 < a < A < \infty$. Let $\mathcal{Q}_N(a, A)$ be the set of all constant
rate birth and death chains on $\{0, \dots, N\}$ with parameters $p, q, r$ satisfying
$p/q \in [a, A]$. The set $\mathcal{Q}_N(a, A)$ is merging in relative-sup with
relative-sup $\epsilon$-merging time bounded above by

$$T_\infty(\epsilon) \le C(a, A)(N + \log_+ 1/\epsilon).$$

In contrast, note that the set $\mathcal Q = \{Q_1, Q_2\}$ where $Q_i$ is the $p_i, q_i$ constant
rate birth and death chain on $\{0, \dots, N\}$ and $p_1 = q_2$, $q_1 = p_2$ cannot be merging
faster than $N^2$ because the product $K = Q_1 Q_2$ is, essentially, a simple random
walk on a circle with almost uniform invariant measure. See [29, Example 2.17].

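It may be illuminating to point out that Theorem 4.3 is of some interest
even in the time homogeneous case. Suppose $K$ is an irreducible aperiodic kernel
with stationary measure $\pi$ and second largest singular value $\sigma$ on $\ell^2(\pi)$. Then
we have

$$\left|\frac{K^n(x,y)}{\pi(y)} - 1\right| \le [\pi(x)\pi(y)]^{-1/2}\,\sigma^n. \qquad (4.7)$$

One difficulty attached to this estimate is that both $[\pi(x)\pi(y)]^{-1/2}$ and $\sigma$
depend on the perhaps unknown stationary measure $\pi$.

Consider instead an initial measure $\mu_0 > 0$ and set $\mu_n = \mu_0 K^n$. Then we
also have

$$\left|\frac{K^n(x,y)}{\mu_n(y)} - 1\right| \le [\mu_0(x)\mu_n(y)]^{-1/2} \prod_1^n \sigma_i, \qquad (4.8)$$

where $\sigma_i$ is the second largest singular value of $K : \ell^2(\mu_i) \to \ell^2(\mu_{i-1})$. In
particular, setting $\mu_0^* = \min_x\{\mu_0(x)\}$,

$$\left|\frac{\pi(y)}{\mu_n(y)} - 1\right| \le [\mu_0^*\,\mu_n(y)]^{-1/2} \prod_1^n \sigma_i. \qquad (4.9)$$

Indeed, (4.9) follows by averaging (4.8) over $x$ against $\pi$, since
$\sum_x \pi(x) K^n(x,y) = \pi(y)$.
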
The estimates (4.8)-(4.9) have the disadvantage that each $\sigma_i$ depends on $\mu_0$
through $\mu_{i-1}$ and $\mu_i$. They have the advantage that they do not depend in
any direct way on $\pi$. From a computational viewpoint, they offer a dynamical
estimate of the error in the approximation of $\pi$ by $\mu_n$.

4.2 An example where stability fails 

In this section, we present a simple example that indicates why stability is a
difficult property to study from a quantitative viewpoint. Let $\Omega_N = \{0, 1, \dots, N\}$,
$N = 2n + 1$. Fix $p, q, r \ge 0$ with $p + q + r = 1$, $p \ne q$, and $\eta_1 \in [0, 1)$. Consider
the Markov kernel $Q_1$ given by

$$\begin{array}{ll}
Q_1(2x, 2x+1) = p, & x = 0, \dots, n, \\
Q_1(2x, 2x-1) = q, & x = 1, \dots, n, \\
Q_1(2x-1, 2x) = q, & x = 1, \dots, n, \\
Q_1(2x+1, 2x) = p, & x = 0, \dots, n-1, \\
Q_1(x, x) = r, & x = 1, \dots, 2n,
\end{array}$$

and

$$Q_1(0,0) = q + r, \quad Q_1(N,N) = \eta_1, \quad Q_1(N, N-1) = 1 - \eta_1.$$



Figure 4: The chain with kernel $Q_1$


This chain has reversible measure $\pi_1$ given by

$$\pi_1(0) = \cdots = \pi_1(N-1) = (1-\eta_1)p^{-1}\pi_1(N) = \frac{(1-\eta_1)p^{-1}}{N(1-\eta_1)p^{-1} + 1}.$$
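The reversibility claim is easy to confirm by direct computation. The sketch below builds $Q_1$ from the displayed formulas and checks detailed balance against $\pi_1$ in exact arithmetic. The function names `build_Q` and `pi_1` and the particular values of $n, p, q, r, \eta_1$ are assumptions made for illustration only.

```python
from fractions import Fraction as F

def build_Q(n, p, q, r, eta):
    """Kernel Q_1 of this section on {0,...,N}, N = 2n+1 (needs p+q+r = 1)."""
    N = 2 * n + 1
    Q = [[F(0)] * (N + 1) for _ in range(N + 1)]
    for x in range(n + 1):
        Q[2 * x][2 * x + 1] = p            # x = 0,...,n
    for x in range(1, n + 1):
        Q[2 * x][2 * x - 1] = q            # x = 1,...,n
        Q[2 * x - 1][2 * x] = q            # x = 1,...,n
    for x in range(n):
        Q[2 * x + 1][2 * x] = p            # x = 0,...,n-1
    for x in range(1, 2 * n + 1):
        Q[x][x] = r                        # x = 1,...,2n
    Q[0][0] = q + r
    Q[N][N] = eta
    Q[N][N - 1] = 1 - eta
    return Q

def pi_1(n, p, eta):
    """Reversible measure of Q_1 as displayed above."""
    N = 2 * n + 1
    t = (1 - eta) / p
    return [t / (N * t + 1)] * N + [1 / (N * t + 1)]

# Illustrative parameter values (not taken from the text).
n, p, q, r, eta1 = 3, F(1, 2), F(1, 3), F(1, 6), F(1, 4)
Q1 = build_Q(n, p, q, r, eta1)
pi = pi_1(n, p, eta1)
size = 2 * n + 2

stochastic = all(sum(row) == 1 for row in Q1)
detailed_balance = all(pi[x] * Q1[x][y] == pi[y] * Q1[y][x]
                       for x in range(size) for y in range(size))
print(stochastic, detailed_balance, sum(pi))
```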



Next, we let $Q_2$ be the kernel obtained by exchanging the roles of $p$ and $q$
and replacing $\eta_1$ by $\eta_2 \in [0, 1)$. Obviously, this kernel has reversible measure $\pi_2$
given by

$$\pi_2(0) = \cdots = \pi_2(N-1) = (1-\eta_2)q^{-1}\pi_2(N) = \frac{(1-\eta_2)q^{-1}}{N(1-\eta_2)q^{-1} + 1}.$$



As long as $p, q$ are bounded away from 0 and 1 and $\eta_1, \eta_2$ are bounded away
from 1, these kernels $Q_1, Q_2$ can be viewed as perturbations of the simple random
walk on a stick (with loops at the ends). Their respective invariant measures
are close to uniform. In fact, they are uniform if $\eta_1 = q + r$, $\eta_2 = p + r$.

It is clear that, even if $r\eta_1\eta_2 = 0$, for any sequence $(K_i)_1^\infty$ with $K_i \in \{Q_1, Q_2\}$
we have

$$\min_{x,y}\{K_{m,m+2N+1}(x,y)\} \ge (\min\{p, q\})^{2N+1} > 0.$$
Hence, if we let $\mu_0 = u$ be the uniform measure and set $\mu_n = \mu_0 K_{0,n}$ then there
exists a constant $c = c(p, q, N) \in (1, \infty)$ such that

$$\forall n, \quad c^{-1} \le \mu_n(x) \le c.$$

Further, it follows that any such sequence $(K_i)_1^\infty$ is merging in total variation
and in relative-sup.

Nevertheless, we are going to show that the stability property fails at the
quantitative level as $N$ tends to infinity. For this purpose, we compute the kernel
of $K = Q_1 Q_2$. To understand $K$, it is useful to imagine that the elements of
$\{0, \dots, N\}$ are arranged on a circle with the even points in the upper half of the
circle and the odd points on the lower half of the circle. The only points on the
horizontal diameter of the circle are 0 and $N$.

The kernel $K$ is given by the formulae:

$$\begin{array}{lll}
K(2x, 2x+2) = p^2, & K(2x+2, 2x) = q^2, & x = 0, \dots, n-2, \\
K(2x+1, 2x+3) = q^2, & K(2x+3, 2x+1) = p^2, & x = 0, \dots, n-2, \\
K(0,0) = 2pq + r, & K(x,x) = 2pq + r^2, & x = 1, \dots, N-2, \\
K(x, x+1) = K(x+1, x) = r(p+q), & & x = 1, \dots, N-2, \\
K(0,1) = q^2 + r(1-r), & K(1,0) = p^2 + r(1-r), & \\
K(N-1, N) = p\eta_2 + rq, & & \\
K(N, N-1) = (1-\eta_2)\eta_1 + (1-\eta_1)r, & & \\
K(N-2, N) = q^2, & K(N, N-2) = (1-\eta_1)p, & \\
K(N-1, N-1) = p(q + 1 - \eta_2) + r^2, & & \\
K(N,N) = \eta_1\eta_2 + (1-\eta_1)q. & &
\end{array}$$

The following special cases are of interest.

(i) $r = 0$, $\eta_1 = q$, $\eta_2 = p$. In this case $\pi_1 = \pi_2$ is uniform and $K$ is the
kernel of a nearest-neighbors random walk on the circle with transition
probabilities $p^2$, $q^2$ and holding $2pq$. Of course, this chain admits the
uniform measure as invariant measure.

(ii) $r = 0$, $\eta_1 = \eta_2 = 0$. In this case, $K$ is essentially the kernel of a
$p' = p^2$, $q' = q^2$, $r' = 2pq$ birth and death chain. More precisely, after writing
$x_0 = N$, $x_1 = N-2$, ..., $x_n = 1$, $x_{n+1} = 0$, $x_{n+2} = 2$, ..., $x_{N-1} = N-3$,
$x_N = N-1$, we have

$$K(x_i, x_{i+1}) = p^2, \quad K(x_i, x_{i-1}) = q^2, \quad K(x_i, x_i) = 2pq$$

except for $K(x_0, x_1) = p$, $K(x_0, x_0) = q$, $K(x_N, x_N) = p + pq$. This chain
has invariant measure

$$\pi(x_i) = \pi(x_0)\,p^{-1}(p/q)^{2i}, \quad i = 1, \dots, N.$$
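The formulae for $K = Q_1 Q_2$ and the invariant measure of case (ii) can be verified mechanically. The following sketch is an illustration only: the helper names `build_Q` and `matmul` and all numerical parameter values are assumptions, not part of the text. It multiplies the two kernels in exact arithmetic, spot-checks several of the displayed entries of $K$ for generic parameters, and then checks $\pi K = \pi$ for $r = \eta_1 = \eta_2 = 0$, using the relabelling $x_0 = N, x_1 = N-2, \dots, x_n = 1, x_{n+1} = 0, x_{n+2} = 2, \dots, x_N = N-1$.

```python
from fractions import Fraction as F

def build_Q(n, p, q, r, eta):
    """Kernel Q_1 on {0,...,N}, N = 2n+1; Q_2 is build_Q(n, q, p, r, eta2)."""
    N = 2 * n + 1
    Q = [[F(0)] * (N + 1) for _ in range(N + 1)]
    for x in range(n + 1):
        Q[2 * x][2 * x + 1] = p
    for x in range(1, n + 1):
        Q[2 * x][2 * x - 1] = q
        Q[2 * x - 1][2 * x] = q
    for x in range(n):
        Q[2 * x + 1][2 * x] = p
    for x in range(1, 2 * n + 1):
        Q[x][x] = r
    Q[0][0], Q[N][N], Q[N][N - 1] = q + r, eta, 1 - eta
    return Q

def matmul(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

n = 3
N = 2 * n + 1

# Generic (illustrative) parameters: spot-check displayed entries of K.
p, q, r, eta1, eta2 = F(1, 2), F(1, 3), F(1, 6), F(1, 4), F(1, 5)
K = matmul(build_Q(n, p, q, r, eta1), build_Q(n, q, p, r, eta2))
entries_ok = (K[0][0] == 2 * p * q + r
              and K[0][2] == p ** 2 and K[2][0] == q ** 2
              and K[0][1] == q ** 2 + r * (1 - r)
              and K[N - 1][N] == p * eta2 + r * q
              and K[N][N] == eta1 * eta2 + (1 - eta1) * q
              and K[N - 1][N - 1] == p * (q + 1 - eta2) + r ** 2)

# Case (ii): r = eta1 = eta2 = 0, so p + q = 1.
p, q = F(1, 3), F(2, 3)
K0 = matmul(build_Q(n, p, q, F(0), F(0)), build_Q(n, q, p, F(0), F(0)))
order = list(range(N, 0, -2)) + list(range(0, N, 2))   # x_0, ..., x_N
pi = [F(0)] * (N + 1)
pi[order[0]] = F(1)                                    # unnormalised pi(x_0)
for i in range(1, N + 1):
    pi[order[i]] = (p / q) ** (2 * i) / p
invariant = all(sum(pi[x] * K0[x][y] for x in range(N + 1)) == pi[y]
                for y in range(N + 1))
print(entries_ok, invariant)
```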
Using the same notation as in (ii) above, we can compute the invariant measure
$\pi$ of $K$ when $r = 0$ for arbitrary values of $\eta_1, \eta_2$. Indeed, $\pi$ must satisfy the
following equations:

$$\begin{aligned}
\pi(x_i) &= 2pq\,\pi(x_i) + p^2\pi(x_{i-1}) + q^2\pi(x_{i+1}), \quad i = 2, \dots, N-1, \\
\pi(x_1) &= 2pq\,\pi(x_1) + (1-\eta_1)p\,\pi(x_0) + q^2\pi(x_2), \\
\pi(x_0) &= (\eta_1\eta_2 + (1-\eta_1)q)\,\pi(x_0) + q^2\pi(x_1) + p\eta_2\,\pi(x_N), \\
\pi(x_N) &= p(q + 1 - \eta_2)\,\pi(x_N) + (1-\eta_2)\eta_1\,\pi(x_0) + p^2\pi(x_{N-1}).
\end{aligned}$$

Because of the first equation, we set $\pi(x_i) = \alpha + \beta(p/q)^{2i}$ for $i = 1, \dots, N$. This
gives

$$\begin{aligned}
(1-\eta_1)p^{-1}\pi(x_0) &= \alpha + \beta, \\
(p - \eta_1(\eta_2 - q))\,\pi(x_0) &= q^2(\alpha + \beta(p/q)^2) + p\eta_2(\alpha + \beta(p/q)^{2N}), \\
(1-\eta_2)\eta_1\,\pi(x_0) &= \alpha(q^2 + p(\eta_2 - p)) + p\eta_2\beta(p/q)^{2N}.
\end{aligned}$$

Since the equations of the system $\pi = \pi K$ are not independent, the three
equations above are not either. Indeed, subtracting the last equation from the
second yields the first. So the previous system is equivalent to

$$\begin{aligned}
(1-\eta_1)p^{-1}\pi(x_0) &= \beta + \alpha, \\
(1-\eta_2)\eta_1\,\pi(x_0) &= \alpha(q^2 + p(\eta_2 - p)) + p\eta_2\beta(p/q)^{2N}.
\end{aligned}$$

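Hence, recalling that $q^2 - p^2 = q - p$ since $p + q = 1$,

$$\beta = \frac{(1-\eta_1)(q/p) - (1-\eta_2)}{q - p + p\eta_2(1 - (p/q)^{2N})}\,\pi(x_0)$$

and

$$\alpha = \frac{(1-\eta_2)\eta_1 - \eta_2(1-\eta_1)(p/q)^{2N}}{q - p + p\eta_2(1 - (p/q)^{2N})}\,\pi(x_0).$$

When $\eta_1 = \eta_2 = 0$ (resp. $\eta_1 = q$, $\eta_2 = p$), we recover $\alpha = 0$, $\beta = p^{-1}\pi(x_0)$
(resp. $\alpha = \pi(x_0)$, $\beta = 0$).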
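As a sanity check on these expressions, one can substitute them back into the four stationarity equations. The sketch below does this in exact arithmetic with the normalisation $\pi(x_0) = 1$. The function name `stationary_profile` and the parameter values are hypothetical choices for illustration; the expressions for $\alpha$ and $\beta$ are those obtained by solving the reduced two-equation system.

```python
from fractions import Fraction as F

def stationary_profile(p, q, eta1, eta2, N):
    """Return [pi(x_0), ..., pi(x_N)] with pi(x_0) = 1 and r = 0 (p + q = 1)."""
    rho = (p / q) ** 2
    D = q - p + p * eta2 * (1 - rho ** N)
    alpha = ((1 - eta2) * eta1 - eta2 * (1 - eta1) * rho ** N) / D
    beta = ((1 - eta1) * q / p - (1 - eta2)) / D
    return [F(1)] + [alpha + beta * rho ** i for i in range(1, N + 1)]

# Illustrative values with p + q = 1 and eta1, eta2 in [0, 1).
p, q, eta1, eta2, N = F(1, 3), F(2, 3), F(1, 4), F(1, 5), 7
pi = stationary_profile(p, q, eta1, eta2, N)

interior = all(
    pi[i] == 2 * p * q * pi[i] + p ** 2 * pi[i - 1] + q ** 2 * pi[i + 1]
    for i in range(2, N))
eq_x1 = pi[1] == 2 * p * q * pi[1] + (1 - eta1) * p * pi[0] + q ** 2 * pi[2]
eq_x0 = pi[0] == ((eta1 * eta2 + (1 - eta1) * q) * pi[0]
                  + q ** 2 * pi[1] + p * eta2 * pi[N])
eq_xN = pi[N] == (p * (q + 1 - eta2) * pi[N]
                  + (1 - eta2) * eta1 * pi[0] + p ** 2 * pi[N - 1])
equations_hold = interior and eq_x1 and eq_x0 and eq_xN
print(equations_hold)
```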

The denominator $q - p + p\eta_2(1 - (p/q)^{2N})$ is positive or negative depending
on whether $q > p$ or $q < p$. By inspection of these formulae, one easily proves
the following facts (the notation $x_i$ refers to the relabelling of the state space
introduced in (ii) above).

• Assume that $q > p$, $r = 0$. For any fixed $\eta_1 > 0$, there is a constant
$c = c(p, q, \eta_1, \eta_2) \in (1, \infty)$ such that, for all large enough $N$, we have

$$\forall x, \quad c^{-1} \le (N+1)\pi(x) \le c.$$

If $\eta_1 = 0$ then there is a constant $c = c(p, q, \eta_2) \in (1, \infty)$ such that, for all
large enough $N$, we have

$$\forall x_i, \quad c^{-1} \le (q/p)^{2i}\pi(x_i) \le c.$$

• Assume that $q < p$, $r = 0$. For any fixed $\eta_2 > 0$, there is a constant
$c = c(p, q, \eta_1, \eta_2) \in (1, \infty)$ such that, for all large enough $N$, we have

$$\forall x, \quad c^{-1} \le (N+1)\pi(x) \le c.$$

If $\eta_2 = 0$ then there is a constant $c = c(p, q, \eta_1) \in (1, \infty)$ such that, for all
large enough $N$, we have

$$\forall x_i, \quad c^{-1} \le (q/p)^{2(i-N)}\pi(x_i) \le c.$$

On the one hand, when $r = \eta_1 = \eta_2 = 0$ and $0 < p \ne q < 1$ are fixed, there
are no constants $c$ independent of $N$ for which the set $\mathcal Q = \{Q_1, Q_2\}$ is $c$-stable.
One can even take $p_N, q_N$ so that $p_N/q_N = 1 + aN^{-\alpha} + o(N^{-\alpha})$ as $N$ tends
to infinity with $a > 0$ and $0 < \alpha < 1$. Then $Q_1$ and $Q_2$ are asymptotically
equal but there are no constants $c$ independent of $N$ for which $\mathcal Q = \{Q_1, Q_2\}$ is
$c$-stable.

On the other hand, when $0 < p, q, r < 1$, $\eta_1 = q + r$ and $\eta_2 = p + r$, the
uniform measure is invariant for both kernels and $\mathcal Q$ is 1-stable.

It seems likely that for fixed $\eta_1, \eta_2, r, p, q$ with $0 < p, q < 1$ and either $r > 0$
or $\eta_1\eta_2 > 0$ the set $\mathcal Q$ is $c$-stable but we do not know how to prove that.

5 Time dependent edge weights 

In this section, we consider a family of graphs $\mathcal G_N = (\Omega_N, E_N)$. These graphs
are non-oriented with no multiple edges (edges are pairs of vertices $e = \{x, y\}$
or singletons $e = \{x\}$). We assume connectedness. We let $d(x)$ be the degree of
$x$, i.e., $d(x) = \#\{e \in E : e \ni x\}$ and set

$$\delta(x) = \frac{d(x)}{\sum_x d(x)}.$$

For simplicity, we assume that these graphs have bounded degree, i.e.,

$$\forall N,\ \forall x \in \Omega_N, \quad d(x) \le D,$$

uniformly in $N$. A simple example is the lazy stick of length $(N + 1)$ as in
Problem 4.2 and Figure 5.

Figure 5: The lazy stick

5.1 Adapted kernels

For any choice of positive weights $\mathbf w = (w_e)_{e \in E}$ on $\mathcal G_N$, we obtain a reversible
Markov kernel $K(\mathbf w)$ with support on pairs $(x, y)$ such that $\{x, y\} \in E$, in which
case

$$K(\mathbf w)(x,y) = \frac{w_{\{x,y\}}}{\sum_{e \ni x} w_e}.$$

The reversible measure is

$$\pi(\mathbf w)(x) = c(\mathbf w)^{-1} \sum_{e \ni x} w_e, \quad c(\mathbf w) = \sum_x \sum_{e \ni x} w_e.$$

For instance, picking $\mathbf w = \mathbf 1$, i.e., $w_e = 1$ for all $e \in E$, we obtain the kernel
$K(\mathbf 1)(x,y) = \mathbf 1_E(\{x,y\})/d(x)$ of the simple random walk on the
given graph. The reversible measure for $K(\mathbf 1)$ is $\pi(\mathbf 1) = \delta$.
Set

$$R(\mathbf w) = \max\{w_e/w_{e'} : e, e' \in E\}.$$

Observe that $R(\mathbf w) \le b$ implies

$$\forall x, \quad b^{-1}\delta(x) \le \pi(\mathbf w)(x) \le b\,\delta(x). \qquad (5.10)$$

For instance, to prove the upper bound, let $w_0 = \min_e\{w_e\}$ and write

$$\pi(\mathbf w)(x) = c(\mathbf w)^{-1} \sum_{e \ni x} w_e \le \frac{\sum_{e \ni x} b\,w_0}{\sum_x \sum_{e \ni x} w_0} = \frac{b\,d(x)}{\sum_x d(x)} = b\,\delta(x).$$

The proof of the lower bound is similar. Further, we also have

$$\forall x, y, \quad (Db)^{-1}\pi(\mathbf w)(y) \le \pi(\mathbf w)(x) \le Db\,\pi(\mathbf w)(y). \qquad (5.11)$$

Indeed, $\sum_{e \ni x} w_e \le Db\,w_0 \le Db \sum_{e \ni y} w_e$.
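The definitions of $K(\mathbf w)$, $\pi(\mathbf w)$ and the comparison (5.10) can be illustrated on a small graph. The sketch below makes several assumptions for the sake of a concrete example: the graph is a path $0, 1, \dots, N$ with a loop at every vertex, edges are stored as `frozenset` objects, and the weights cycle through the values 1, 2, 3. None of these choices come from the text.

```python
from fractions import Fraction as F

def path_with_loops(N):
    """Edges of a path 0-1-...-N with a loop at each vertex (assumed example)."""
    edges = [frozenset({x}) for x in range(N + 1)]
    edges += [frozenset({x, x + 1}) for x in range(N)]
    return edges

def kernel_and_measure(N, w):
    """Return (K(w), pi(w)) for a dict w mapping edges to positive weights."""
    tot = [sum(we for e, we in w.items() if x in e) for x in range(N + 1)]
    c = sum(tot)                               # c(w)
    K = [[F(0)] * (N + 1) for _ in range(N + 1)]
    for e, we in w.items():
        xs = sorted(e)
        x, y = xs[0], xs[-1]                   # x == y when e is a loop
        K[x][y] = we / tot[x]
        K[y][x] = we / tot[y]
    return K, [t / c for t in tot]

N = 5
edges = path_with_loops(N)
w = {e: F(1 + i % 3) for i, e in enumerate(edges)}    # weights in {1, 2, 3}
b = max(w.values()) / min(w.values())                 # R(w)

K, pi = kernel_and_measure(N, w)
deg = [sum(1 for e in edges if x in e) for x in range(N + 1)]
delta = [F(d, sum(deg)) for d in deg]

stochastic = all(sum(row) == 1 for row in K)
reversible = all(pi[x] * K[x][y] == pi[y] * K[y][x]
                 for x in range(N + 1) for y in range(N + 1))
bound_5_10 = all(delta[x] / b <= pi[x] <= b * delta[x] for x in range(N + 1))
print(stochastic, reversible, bound_5_10)
```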

For any $N$ and $b \ge 1$, set

$$\mathcal Q(\mathcal G_N, b) = \{K(\mathbf w) : R(\mathbf w) \le b\}.$$

For any $N$, $b \ge 1$ and fixed probability measure $\pi$ on $\Omega_N$, set

$$\mathcal Q(\mathcal G_N, b, \pi) = \{K(\mathbf w) : R(\mathbf w) \le b,\ \pi(\mathbf w) = \pi\}.$$

The set $\mathcal Q(\mathcal G_N, b, \pi)$ may well be empty. However, we can use the
Metropolis algorithm construction to prove the following lemma.

Lemma 5.1. Assume that $\{x\} \in E$ for all $x$ (i.e., the graphs $\mathcal G_N$ have a loop at
each vertex) and that $a^{-1} \le \pi(x)/\delta(x) \le a$. Then the set $\mathcal Q(\mathcal G_N, a^2(b^3 + bD), \pi)$
is non-empty for any $b \ge 1$. It contains a continuum of kernels $K(\mathbf w)$ for any
$b > 1$.
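Proof. Starting from any weight $\mathbf v$ with $R(\mathbf v) \le b$, we define a new weight $\mathbf w$ by
setting

$$\forall \{x,y\} \in E,\ x \ne y, \quad w_{\{x,y\}} = v_{\{x,y\}} \min\left\{\frac{\pi(x)}{\pi(\mathbf v)(x)}, \frac{\pi(y)}{\pi(\mathbf v)(y)}\right\}$$

and

$$w_{\{x\}} = \pi(x)\,c(\mathbf v) - \sum_{y \ne x,\ \{x,y\} \in E} v_{\{x,y\}} \min\left\{\frac{\pi(x)}{\pi(\mathbf v)(x)}, \frac{\pi(y)}{\pi(\mathbf v)(y)}\right\}.$$

It is clear that $\pi(\mathbf w) = \pi$ (indeed, $K(\mathbf w)$ is the kernel of the Metropolis algorithm
chain for $\pi$ with proposal based on $K(\mathbf v)$). Further, since

$$\sum_{y \ne x,\ \{x,y\} \in E} v_{\{x,y\}} \min\left\{\frac{\pi(x)}{\pi(\mathbf v)(x)}, \frac{\pi(y)}{\pi(\mathbf v)(y)}\right\} \le \pi(x)\left(c(\mathbf v) - \frac{v_{\{x\}}}{\pi(\mathbf v)(x)}\right),$$

we have

$$w_{\{x\}} \ge \frac{\pi(x)\,v_{\{x\}}}{\pi(\mathbf v)(x)}.$$

Now, since $a^{-1}\delta(x) \le \pi(x) \le a\,\delta(x)$ and $K(\mathbf v) \in \mathcal Q(\mathcal G_N, b)$, we obtain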



and 



V{x,2/} G £:,a;', max | ^^^i^, -^^^^ I < a^foL*. 



Hence $R(\mathbf w) \le a^2(b^3 + bD)$ and $K(\mathbf w) \in \mathcal Q(\mathcal G_N, a^2(b^3 + bD), \pi)$ as desired. □
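The Metropolis-type construction used in this proof can be sketched in a few lines. In the illustration below, the graph (a path with a loop at every vertex), the starting weight `v` and the target measure `target` are all hypothetical. Off-diagonal weights are multiplied by the minimum of the two ratios $\pi/\pi(\mathbf v)$, and each loop weight absorbs the remainder so that the total weight at $x$ equals $\pi(x)\,c(\mathbf v)$. The check confirms that the new weight is positive and that $\pi(\mathbf w)$ equals the target.

```python
from fractions import Fraction as F

def vertex_totals(N, w):
    """Total weight of the edges containing each vertex."""
    return [sum(we for e, we in w.items() if x in e) for x in range(N + 1)]

def metropolis_weights(N, v, target):
    """Weight w with pi(w) = target, built from v as in the proof above."""
    tot_v = vertex_totals(N, v)
    c_v = sum(tot_v)
    ratio = [target[x] * c_v / tot_v[x] for x in range(N + 1)]  # pi / pi(v)
    w = {}
    for e, ve in v.items():
        if len(e) == 2:                       # genuine edge {x, y}, x != y
            x, y = tuple(e)
            w[e] = ve * min(ratio[x], ratio[y])
    for x in range(N + 1):
        # Loops of other vertices do not contain x, so this sums only
        # the off-diagonal weights at x.
        off_diag = sum(we for e, we in w.items() if x in e)
        w[frozenset({x})] = target[x] * c_v - off_diag
    return w

# Hypothetical example: path 0..4 with loops, mildly non-uniform weights.
N = 4
v = {frozenset({x}): F(1) for x in range(N + 1)}
v.update({frozenset({x, x + 1}): F(1 + x % 2) for x in range(N)})
target = [F(1, 10), F(2, 10), F(4, 10), F(2, 10), F(1, 10)]

w = metropolis_weights(N, v, target)
tot_w = vertex_totals(N, w)
pi_w = [t / sum(tot_w) for t in tot_w]
print(all(we > 0 for we in w.values()), pi_w == target)
```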



5.2 Time homogeneous results 

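For each $N$, let $\sigma_N$ be the second singular value of $(K(\mathbf 1), \delta)$, i.e., the second largest
eigenvalue in absolute value of the simple random walk on $\mathcal G_N$. For instance, for
the "lazy stick" of Figure 5, $1 - \sigma_N$ is of order $1/N^2$. For any $\mathbf w$, let $\sigma(\mathbf w)$ be the
second largest singular value of $(K(\mathbf w), \pi(\mathbf w))$. The following proposition concerns
the time homogeneous chains associated with kernels in $\mathcal Q(\mathcal G_N, b)$.

Proposition 5.2. For any $b \ge 1$ and any $K(\mathbf w) \in \mathcal Q(\mathcal G_N, b)$,

$$b^{-2}(1 - \sigma_N) \le 1 - \sigma(\mathbf w).$$

In particular, uniformly over $K(\mathbf w) \in \mathcal Q(\mathcal G_N, b)$,

$$\left|\frac{K(\mathbf w)^n(x,y)}{\pi(\mathbf w)(y)} - 1\right| \le b\,d_*^{-1}\Lambda_N\,(1 - b^{-2}(1 - \sigma_N))^n, \qquad (5.12)$$

with $\Lambda_N = \sum_x d(x)$, $d_* = \min_x\{d(x)\}$.
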
Proof. This is based on the basic comparison techniques of [7]. In the present
case, it is best to compare the lowest and second largest eigenvalues of $K(\mathbf 1)$,
call them $\beta_-$ and $\beta_1$, respectively, with the same quantities $\beta_-(\mathbf w)$ and $\beta_1(\mathbf w)$
relative to $K(\mathbf w)$. The relation with the singular value $\sigma(\mathbf w)$ is given by
$\sigma(\mathbf w) = \max\{-\beta_-(\mathbf w), \beta_1(\mathbf w)\}$. For comparison purposes, one uses the Dirichlet forms
(recall that edges here are (non-oriented) pairs $\{x, y\}$)

$$\mathcal E(f,f) = \frac{1}{\Lambda_N} \sum_{e = \{x,y\}} |f(x) - f(y)|^2$$

and

$$\mathcal E_{\mathbf w}(f,f) = \frac{1}{c(\mathbf w)} \sum_{e = \{x,y\}} |f(x) - f(y)|^2\,w_{\{x,y\}}.$$



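Clearly, for any $f$, with $w_0 = \min_e\{w_e\}$,

$$\mathcal E(f,f) \le \frac{c(\mathbf w)}{w_0\,\Lambda_N}\,\mathcal E_{\mathbf w}(f,f), \quad \mathrm{Var}_{\pi(\mathbf w)}(f) \le \frac{b\,w_0\,\Lambda_N}{c(\mathbf w)}\,\mathrm{Var}_\delta(f). \qquad (5.13)$$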



This yields $1 - \beta_1 \le b^2(1 - \beta_1(\mathbf w))$. A similar argument using (the sum here is
over all $x, y$ with $\{x, y\} \in E$, which explains the $\frac12$ factor)

$$\mathcal F_{\mathbf w}(f,f) = \frac{1}{2c(\mathbf w)} \sum_{x,y} |f(x) + f(y)|^2\,w_{\{x,y\}}$$

yields $1 + \beta_- \le b^2(1 + \beta_-(\mathbf w))$. This gives the desired result. □

Example 5.3. For our present purpose, call "$(d, \epsilon)$-expander family" any infinite
family of regular graphs $\mathcal G_N$ of fixed degree $d$, with $|\Omega_N|$ tending
to infinity with $N$ and satisfying $\sigma_N \le 1 - \epsilon$. See [17] for various related
definitions and discussions of particular examples. Proposition 5.2 shows that
for any $K(\mathbf w) \in \mathcal Q(\mathcal G_N, b)$, we have

$$\left|\frac{K(\mathbf w)^n(x,y)}{\pi(\mathbf w)(y)} - 1\right| \le b\,|\Omega_N|\,(1 - \epsilon/b^2)^n.$$



Let us point out that, besides singular values, there are further related
techniques that yield complementary results. They include the use of Nash 
and logarithmic Sobolev inequalities (modified or not). See O El [28l [30] . For 
instance, to show that on the "lazy stick" $\mathcal G_N$ of Figure 5 any chain with kernel
in $\mathcal Q(\mathcal G_N, b)$ converges to stationarity in order $N^2$, one uses the Nash inequality
technique of [5].



5.3 Time inhomogeneous chains 

A fundamental question about time inhomogeneous Markov chains is whether
or not a result similar to (5.12) holds true for time inhomogeneous chains with
kernels in $\mathcal Q(\mathcal G_N, b)$. Little is known about this.

Fix $b > 1$. Let $(K_i)_1^\infty$ be a sequence of Markov kernels in $\mathcal Q(\mathcal G_N, b)$ and
$K_{m,n}$ be the associated iterated kernel. Recall that the property "$\sigma_N < 1$" is
equivalent to the irreducibility and aperiodicity of $K(\mathbf 1)$. Because all the kernels
in $\mathcal Q(\mathcal G_N, b)$ are (uniformly) adapted to the graph structure $\mathcal G_N$, there exist
$\ell = \ell(N, b)$ and $\epsilon = \epsilon(N, b) > 0$ such that, for all $n$, $K_{n,n+\ell}(x,y) \ge \epsilon$. As
explained in Section 2.3, this implies relative-sup merging for any such time
inhomogeneous chain. However, this result is purely qualitative. No acceptable
quantitative result can be obtained by such an argument.

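Problem 5.4. Fix reals $D, b > 1$. Prove or disprove that there exists a constant
$A$ such that for any family $\mathcal G_N$ with maximal degree at most $D$, any sequence
$(K_i)_1^\infty$ with $K_i \in \mathcal Q(\mathcal G_N, b)$, any initial distributions $\mu_0, \mu_0'$ and any $\epsilon > 0$, if

$$n \ge A(1 - \sigma_N)^{-1}(\log|\Omega_N| + \log_+(1/\epsilon))$$

then $\mu_n = \mu_0 K_{0,n}$ and $\mu_n' = \mu_0' K_{0,n}$ satisfy

$$\max_x \left\{\left|\frac{\mu_n'(x)}{\mu_n(x)} - 1\right|\right\} \le \epsilon.$$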



This is an open problem, even for the "lazy stick" of Figure 5. It seems
rather unclear whether one should expect a positive answer or not.
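The quantity appearing in Problem 5.4 is easy to explore numerically, although such experiments are of course no substitute for a proof. The sketch below is a toy experiment under explicit assumptions: the graph is a path $0, \dots, N$ with a loop at every vertex, each kernel $K_i$ is $K(\mathbf w)$ for fresh independent weights drawn uniformly from $[1, b]$ (so that $R(\mathbf w) \le b$), the two initial distributions are point masses at the two ends, and floating point arithmetic is used. It records $\max_x |\mu_n'(x)/\mu_n(x) - 1|$ for $n \ge N$, once both measures charge every vertex.

```python
import random

def random_kernel(N, b, rng):
    """K(w) on the path 0..N with loops; weights uniform in [1, b]."""
    loop = [rng.uniform(1, b) for _ in range(N + 1)]
    edge = [rng.uniform(1, b) for _ in range(N)]     # edge[x] joins x, x+1
    K = [[0.0] * (N + 1) for _ in range(N + 1)]
    for x in range(N + 1):
        left = edge[x - 1] if x > 0 else 0.0
        right = edge[x] if x < N else 0.0
        tot = loop[x] + left + right
        K[x][x] = loop[x] / tot
        if x > 0:
            K[x][x - 1] = left / tot
        if x < N:
            K[x][x + 1] = right / tot
    return K

def step(mu, K):
    """One step of the chain on measures: mu -> mu K."""
    n = len(mu)
    return [sum(mu[x] * K[x][y] for x in range(n)) for y in range(n)]

def merging_errors(N, b, steps, seed=0):
    """Relative-sup distance between two chains driven by the same kernels."""
    rng = random.Random(seed)
    mu = [1.0] + [0.0] * N           # point mass at 0
    nu = [0.0] * N + [1.0]           # point mass at N
    errs = {}
    for n in range(1, steps + 1):
        K = random_kernel(N, b, rng)
        mu, nu = step(mu, K), step(nu, K)
        if n >= N:                   # from time N on, mu charges every vertex
            errs[n] = max(abs(nu[x] / mu[x] - 1) for x in range(N + 1))
    return errs

errs = merging_errors(N=8, b=2.0, steps=400)
print(errs[8], errs[400])
```

Since $\mu_{n+1}'(y)/\mu_{n+1}(y)$ is a weighted average of the ratios $\mu_n'(x)/\mu_n(x)$, the recorded error is non-increasing in $n$ up to rounding; the open question concerns the rate at which it decays.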

Next, we consider another question, quite interesting but, a priori, of a
different nature. Recall that, given $\mathcal G_N$, $\delta$ denotes the normalized reversible
measure of $K(\mathbf 1)$.

Problem 5.5. Fix reals $D, b > 1$. Prove or disprove that there exists a constant
$A \ge 1$ such that for any family $\mathcal G_N$ with maximal degree at most $D$, any sequence
$(K_i)_1^\infty$ with $K_i \in \mathcal Q(\mathcal G_N, b)$ and any initial distribution $\mu_0$, if

$$n \ge A(1 - \sigma_N)^{-1}\log|\Omega_N|$$

then $\mu_n = \mu_0 K_{0,n}$ satisfies

$$\forall x, \quad A^{-1} \le \frac{\mu_n(x)}{\delta(x)} \le A.$$

In words, a positive solution to Problem 5.4 yields the relative-sup merging
in time of order at most $A(1 - \sigma_N)^{-1}\log|\Omega_N|$, uniformly for any time
inhomogeneous chain with kernels in $\mathcal Q(\mathcal G_N, b)$, whereas a positive solution to Problem
5.5 would indicate that, after a time of order at most $A(1 - \sigma_N)^{-1}\log|\Omega_N|$,
uniformly for any time inhomogeneous chain with kernels in $\mathcal Q(\mathcal G_N, b)$ and for
any initial distribution $\mu_0$, the measure $\mu_n = \mu_0 K_{0,n}$ is comparable to $\delta$. In
fact, because of the uniform way in which Problem 5.5 is formulated, a positive
answer implies that the measure $\delta$ is $A$-stable for $\mathcal Q(\mathcal G_N, b)$.

At this writing, the best evidence for a positive answer to these problems is 
contained in the following two partial results. The first result concerns sequences 
whose kernels share the same invariant distribution. For the proof, see [55]. 
Theorem 5.6. Fix reals $D, b > 1$ and measures $\pi_N$ on $\Omega_N$. Assume that $\mathcal G_N$
has maximal degree at most $D$ and that $\mathcal Q(\mathcal G_N, b, \pi_N)$ is non-empty. Under
these circumstances, there is a constant $A = A(D, b)$ such that for any $\epsilon > 0$,
any sequence $(K_i)_1^\infty$ with $K_i \in \mathcal Q(\mathcal G_N, b, \pi_N)$ and any pair $\mu_0, \mu_0'$ of initial
distributions, if

$$n \ge A(1 - \sigma_N)^{-1}(\log|\Omega_N| + \log_+(1/\epsilon))$$

then $\mu_n = \mu_0 K_{0,n}$ and $\mu_n' = \mu_0' K_{0,n}$ satisfy

$$\max_x \left\{\left|\frac{\mu_n'(x)}{\mu_n(x)} - 1\right|\right\} \le \epsilon.$$

Note that the hypothesis that $\mathcal Q(\mathcal G_N, b, \pi_N)$ is non-empty implies that
$b^{-1} \le \pi_N/\delta \le b$. The second result assumes $c$-stability. For the proof, see [50].


Theorem 5.7. Fix reals $D, b, c > 1$. Assume that $\mathcal G_N$ has maximal degree at
most $D$. Let $(K_i)_1^\infty$ be a sequence of kernels on $\Omega_N$ with $K_i \in \mathcal Q(\mathcal G_N, b)$. Assume
that the distribution $\delta$ on $\Omega_N$ is $c$-stable for $(K_i)_1^\infty$. Then there exists a constant
$A = A(D, b, c)$ such that for any $\epsilon > 0$ and pair $\mu_0, \mu_0'$ of initial distributions, if

$$n \ge A(1 - \sigma_N)^{-1}(\log|\Omega_N| + \log_+(1/\epsilon))$$

then $\mu_n = \mu_0 K_{0,n}$ and $\mu_n' = \mu_0' K_{0,n}$ satisfy

$$\max_x \left\{\left|\frac{\mu_n'(x)}{\mu_n(x)} - 1\right|\right\} \le \epsilon.$$



Theorem 5.6 can be viewed as a special case of Theorem 5.7. Indeed, if
$\mathcal Q(\mathcal G_N, b, \pi_N)$ is not empty then we must have $b^{-1}\delta \le \pi_N \le b\,\delta$ so that $\delta$ is
a $b$-stable measure for any sequence of kernels in $\mathcal Q(\mathcal G_N, b, \pi_N)$. By Lemma
5.1, it is not difficult to produce examples where Theorem 5.6 applies. Finding
examples of application of Theorem 5.7 (where the $K_i$'s do not all share the
same invariant distribution) is a difficult problem.

Under the stability hypothesis of Theorem 5.7, methods such as Nash
inequalities and logarithmic Sobolev inequalities can also be applied. See [31].



Figure 6: The underlying graph for the kernels $Q_1, Q_2$ of Section 4.2



Remark 5.8. Consider the kernels $Q_1, Q_2$ of Section 4.2 with fixed $p, q, r, \eta_1, \eta_2$
with $r = \eta_1 = \eta_2 = 0$ and $0 < p \ne q < 1$. The kernels $Q_1, Q_2$ are adapted to
the graph structure of Figure 6. We proved in Section 4.2 that stability fails
for $\mathcal Q = \{Q_1, Q_2\}$. Even on the "lazy stick" of Figure 5, we do not understand
whether stability holds or not. An interesting example of stability on the lazy
stick is proved in [29]. This example involves perturbations that are localized
at the ends of the stick. Further examples are discussed in [31].